Variational quantum compiling with double Q-learning

Abstract

Quantum compiling aims to construct a quantum circuit V from quantum gates drawn from a native gate alphabet, such that V is functionally equivalent to the target unitary U. It is a crucial stage for running quantum algorithms on noisy intermediate-scale quantum (NISQ) devices. However, the space for structure exploration of quantum circuits is enormous, resulting in the requirement of human expertise, hundreds of experimentations, or modifications of existing quantum circuits. In this paper, we propose a variational quantum compiling (VQC) algorithm based on reinforcement learning (RL), in order to automatically design the structure of the quantum circuit for VQC with no human intervention. An agent is trained to sequentially select quantum gates from the native gate alphabet and the qubits they act on by double Q-learning with an ε-greedy exploration strategy and experience replay. At first, the agent randomly explores a number of quantum circuits with different structures, and then iteratively discovers structures with higher performance on the learning task. Simulation results show that the proposed method can make exact compilations with fewer quantum gates compared to previous VQC algorithms. It can reduce the errors of quantum algorithms due to the decoherence process and gate noise in NISQ devices, and enable quantum algorithms, especially complex ones, to be executed within the coherence time.
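As a rough illustration of the three ingredients named in the abstract (double Q-learning, ε-greedy exploration, and experience replay), the following Python sketch trains a tabular agent on a toy gate-selection task. It is not the authors' implementation: the gate alphabet `ACTIONS`, the hidden sequence `TARGET`, the 0/1 reward, and all hyperparameters are hypothetical choices made for this sketch, and the paper's actual state encoding, reward, and value-function model are not described here.

```python
import random
from collections import defaultdict, deque

# Hypothetical toy setup (not from the paper): a tiny gate alphabet and a
# hidden two-gate sequence standing in for "circuit V matches target U".
ACTIONS = [("H", 0), ("X", 1), ("CNOT", (0, 1))]
TARGET = (("H", 0), ("CNOT", (0, 1)))
DEPTH = len(TARGET)


def step(state, action):
    """Append one gate; reward 1 only when the finished sequence is TARGET."""
    next_state = state + (action,)
    done = len(next_state) == DEPTH
    reward = 1.0 if done and next_state == TARGET else 0.0
    return next_state, reward, done


def epsilon_greedy(q_a, q_b, state, epsilon, rng):
    """Random action with probability epsilon, else greedy on Q_A + Q_B."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_a[(state, a)] + q_b[(state, a)])


def double_q_update(q_a, q_b, transition, alpha, gamma, rng):
    """One double Q-learning step: the table being updated selects the
    greedy next action, and the other table evaluates that action."""
    state, action, reward, next_state, done = transition
    select, evaluate = (q_a, q_b) if rng.random() < 0.5 else (q_b, q_a)
    target = reward
    if not done:
        best = max(ACTIONS, key=lambda a: select[(next_state, a)])
        target += gamma * evaluate[(next_state, best)]
    select[(state, action)] += alpha * (target - select[(state, action)])


def train(episodes=2000, alpha=0.5, gamma=0.9, batch=4, seed=0):
    rng = random.Random(seed)
    q_a, q_b = defaultdict(float), defaultdict(float)
    replay = deque(maxlen=500)  # experience replay buffer
    for episode in range(episodes):
        # Mostly random exploration at first, then increasingly greedy.
        epsilon = max(0.1, 1.0 - episode / (0.5 * episodes))
        state, done = (), False
        while not done:
            action = epsilon_greedy(q_a, q_b, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            replay.append((state, action, reward, next_state, done))
            # Learn from a random minibatch of stored transitions.
            for transition in rng.sample(list(replay), min(batch, len(replay))):
                double_q_update(q_a, q_b, transition, alpha, gamma, rng)
            state = next_state
    return q_a, q_b


def greedy_sequence(q_a, q_b):
    """Read out the gate sequence the trained tables prefer."""
    state = ()
    while len(state) < DEPTH:
        state += (epsilon_greedy(q_a, q_b, state, 0.0, random.Random(0)),)
    return state
```

Calling `greedy_sequence(*train())` reads out the learned gate sequence. The decaying `epsilon` mirrors the abstract's description of random exploration first and exploitation of better structures later; a realistic compiler would replace the 0/1 reward with a cost that measures how close V is to U.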

Related resources

Deep Reinforcement Learning with Double Q-Learning

The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether this harms performance, and whether they can generally be prevented. In this paper, we answer all these questions affirmatively. In particular, we first show that the recent DQN algorithm, which combines Q-le...

Double Q-learning

In some stochastic environments the well-known reinforcement learning algorithm Q-learning performs very poorly. This poor performance is caused by large overestimations of action values. These overestimations result from a positive bias that is introduced because Q-learning uses the maximum action value as an approximation for the maximum expected action value. We introduce an alternative way ...
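The positive bias described in this snippet can be reproduced numerically. The sketch below is an illustrative assumption, not code from the cited paper: every action has a true mean reward of 0, so the true maximum expected value is 0. The single estimator takes the maximum of noisy sample means, while the double estimator picks the best action from one batch of samples and reads its value from a second, independent batch. The function name `estimate_max` and all parameter values are hypothetical.

```python
import random


def estimate_max(num_actions=10, samples=10, trials=2000, seed=0):
    """Average single and double estimates of max_a E[reward(a)], where
    every action's true mean is 0 (so the correct answer is 0)."""
    rng = random.Random(seed)

    def sample_means():
        # One noisy sample mean per action, from unit-variance Gaussian rewards.
        return [
            sum(rng.gauss(0.0, 1.0) for _ in range(samples)) / samples
            for _ in range(num_actions)
        ]

    single_total, double_total = 0.0, 0.0
    for _ in range(trials):
        means_a = sample_means()
        means_b = sample_means()  # independent second batch
        # Single estimator: max of noisy means, biased upward.
        single_total += max(means_a)
        # Double estimator: batch A selects the action, batch B evaluates it.
        best = max(range(num_actions), key=means_a.__getitem__)
        double_total += means_b[best]
    return single_total / trials, double_total / trials


single, double = estimate_max()
print(f"single estimator: {single:.3f}, double estimator: {double:.3f}")
```

In this setup the single estimate sits clearly above 0 while the double estimate stays close to 0, because the noise used to select the action is independent of the noise used to evaluate it.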

Weighted Double Q-learning

Q-learning is a popular reinforcement learning algorithm, but it can perform poorly in stochastic environments due to overestimating action values. Overestimation is due to the use of a single estimator that uses the maximum action value as an approximation for the maximum expected action value. To avoid overestimation in Q-learning, the double Q-learning algorithm was recently proposed, which u...

Evaluating project’s completion time with Q-learning

Nowadays project management is a key component in introductory operations management. The educators and the researchers in these areas advocate representing a project as a network and applying the solution approaches for network models to them to assist project managers to monitor their completion. In this paper, we evaluated project’s completion time utilizing the Q-learning algorithm. So the ...

Journal

Journal title: New Journal of Physics

Year: 2021

ISSN: 1367-2630

DOI: https://doi.org/10.1088/1367-2630/abe0ae